Skip to content

msl: fix cooperative matrix loading semantics#9451

Open
kvark wants to merge 1 commit intogfx-rs:trunkfrom
kvark:msl-coop-mat
Open

msl: fix cooperative matrix loading semantics#9451
kvark wants to merge 1 commit intogfx-rs:trunkfrom
kvark:msl-coop-mat

Conversation

@kvark
Copy link
Copy Markdown
Member

@kvark kvark commented Apr 20, 2026

Connections
Following up on support from #8251.

Description
Looks like this was broken and it slipped through because the example was using a symmetric matrix.
MSL loaded cooperative matrix data transposed compared to SPV.
The internal spec document also needs to be fixed - specifying that it's ColMajor by default is consistent with the rest of matrix logic in WGSL.

Testing
Example, updated test

Squash or Rebase?

Squash is fine.

Checklist

  • I self-reviewed and fully understand this PR.
  • WebGPU implementations built with wgpu may be affected behaviorally.
  • Validation and feature gates are in place to confine behavioral changes.
  • Tests demonstrate the validation and altered logic works.
  • CHANGELOG.md entries for the user-facing effects of this change are present.
  • The PR is minimal, and doesn't make sense to land as multiple PRs.
  • Commits are logically scoped and individually reviewable.
  • The PR description has enough context to understand the motivation and solution implemented.

The naga MSL backend forwarded the WGSL `coopLoad(T)` / `coopStore(T)`
`row_major` flag straight through to Metal's `simdgroup_load` /
`simdgroup_store` `transpose` argument. Metal's `transpose` means
"memory is transposed relative to the simdgroup_matrix canonical layout"
— and on Apple silicon that canonical layout is row-major, so
`transpose=false` is the correct load for row-major memory. Naga's IR
uses `row_major=true` to mean "memory is row-major"; the emission must
therefore negate the flag when targeting Metal, matching the SPIRV
backend which already correctly maps `row_major=true → RowMajorKHR`.

Symptom: any WGSL using `coopLoadT` / `coopStoreT` on row-major memory
produced numerically-wrong results on Apple silicon. The existing
wgpu cooperative-matrix example masked this because it used `coopLoad`
on row-major memory (the matching error cancelled the MSL one on
Metal), and because its A-matrix init happened to be symmetric so the
corresponding bug on Vulkan was also invisible. Neither regression
path will be silent after this commit:

 - `naga/src/back/msl/writer.rs` emits the correct `transpose` flag for
   `coopLoad` / `coopStore` and `coopLoadT` / `coopStoreT`. The
   snapshot in `naga/tests/out/msl/wgsl-cooperative-matrix.metal` is
   regenerated accordingly (2-line diff).
 - `examples/features/.../shader.wgsl` now uses `coopLoadT` /
   `coopStoreT` so that the WGSL matches the "Row-major tiles"
   comment in the surrounding host code. This also serves as a working
   reference for the naga convention.
 - `examples/features/.../mod.rs` reinitialises matrix A with a
   row/column-weighted formula so neither A nor B is symmetric in
   (i, j); the test can no longer pass by coincidence regardless of
   which load variant a future regression picks.
 - `docs/api-specs/cooperative_matrix.md` documents both variants
   (`coopLoad` = column-major, `coopLoadT` = row-major), updates the
   worked example to use the T variants with its row-major buffers,
   and adds "variant does not match storage" as an undefined-behaviour
   case.
@kvark
Copy link
Copy Markdown
Member Author

kvark commented Apr 20, 2026

@jimblandy this is quite important to land ASAP. I was having issues going between Linux and MacOS development, and now realized the implementation had been inconsistent. Need to pick this up to continue cross-platform development.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants